First-Order Multi-class Subgroup Discovery
Abstract
Subgroup discovery is concerned with finding subsets of a population whose class distribution differs significantly from the overall distribution. So far, subgroup discovery has been investigated predominantly in a propositional logic framework. This paper investigates multi-class subgroup discovery in an inductive logic programming setting, where subgroups are defined by conjunctions in first-order logic. We present a new weighted covering algorithm, inspired by the Aleph first-order rule learner, that uses seed examples to learn diverse, representative and highly predictive subgroups that capture interesting patterns across multiple classes. In our experiments, the approach shows considerable and statistically significant improvements in predictive power, in terms of both accuracy and AUC, and in theory construction time, achieved by considering fewer hypotheses.
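The weighted covering idea mentioned in the abstract can be illustrated with a deliberately simplified sketch. It is propositional (conjunctions of attribute-value tests) rather than first-order, and the seed choice, the quality measure (a weighted multi-class variant of weighted relative accuracy), the decay factor `gamma`, the toy data and all function names are assumptions made for illustration. It is not the paper's algorithm and does not use Aleph.

```python
from itertools import combinations


def covers(rule, example):
    """A rule is a tuple of (attribute, value) pairs, read as a conjunction."""
    return all(example[a] == v for a, v in rule)


def quality(rule, examples, labels, weights):
    """Assumed measure: sum over classes of |p(c, rule) - p(c) * p(rule)|,
    with probabilities estimated from the current example weights."""
    total = sum(weights)
    covered = [i for i, e in enumerate(examples) if covers(rule, e)]
    w_cov = sum(weights[i] for i in covered)
    score = 0.0
    for c in set(labels):
        w_c = sum(w for w, y in zip(weights, labels) if y == c)
        w_both = sum(weights[i] for i in covered if labels[i] == c)
        score += abs(w_both / total - (w_c / total) * (w_cov / total))
    return score


def weighted_covering(examples, labels, n_rules=2, max_len=2, gamma=0.5):
    """Learn up to n_rules subgroups; covered examples are down-weighted
    (not removed), so later rules are pushed towards other regions."""
    weights = [1.0] * len(examples)
    used_seeds = set()
    rules = []
    for _ in range(n_rules):
        unused = [i for i in range(len(examples)) if i not in used_seeds]
        if not unused:
            break
        # Assumed seed choice: the unused example with the highest weight.
        seed = max(unused, key=lambda i: weights[i])
        used_seeds.add(seed)
        # Only conjunctions that hold for the seed are considered, which is
        # what keeps the hypothesis space small.
        items = sorted(examples[seed].items())
        candidates = [combo
                      for k in range(1, max_len + 1)
                      for combo in combinations(items, k)]
        best = max(candidates,
                   key=lambda r: quality(r, examples, labels, weights))
        rules.append(best)
        for i, e in enumerate(examples):
            if covers(best, e):
                weights[i] *= gamma
    return rules, weights


# Hypothetical toy data, for illustration only.
examples = [
    {"shape": "round", "colour": "red"},
    {"shape": "round", "colour": "red"},
    {"shape": "square", "colour": "red"},
    {"shape": "square", "colour": "blue"},
    {"shape": "round", "colour": "blue"},
]
labels = ["a", "a", "b", "b", "c"]
rules, weights = weighted_covering(examples, labels)
```

Because each candidate is built from the seed's own attribute values, every learned rule covers its seed, and the number of hypotheses scored per iteration depends on the seed's description length rather than on the full attribute space.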
Related papers
The Advantages of Seed Examples in First-Order Multi-class Subgroup Discovery
Subgroup discovery is halfway between predictive and descriptive rule learning: while there is a target concept, the goal of subgroup discovery is not necessarily to achieve high accuracy in predicting the target, but rather to identify subsets of the population whose class distribution is significantly different from the overall distribution. The target concept helps us to achieve a trade-off ...
Evaluation Measures for Multi-class Subgroup Discovery
Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. So far it has been investigated predominantly in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for multi-class subgroups, four of them new, and study their theoretical properti...
Exploiting the High Predictive Power of Multi-class Subgroups
Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. A number of multi-class subgroup discovery methods have previously been investigated, proposed and implemented in the CN2-MSD system. When a decision tree learner was applied using the induced subgroups as features, it led to the construction of accurate a...
Local Patterns: Theory and Practice of Constraint-Based Relational Subgroup Discovery
This paper investigates local patterns in the multi-relational constraint-based data mining framework. Within this framework, it contributes to the theory of local patterns by providing a definition of local patterns and a set of objective and subjective measures for evaluating the quality of induced patterns. These notions are illustrated on a description task of subgroup discovery, taking a...
RSD: Relational Subgroup Discovery through First-Order Feature Construction
Relational rule learning is typically used to solve classification and prediction tasks. However, it can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved by appropriately adapting rule learning and first-order feature construction. The proposed approach, applicable to subgroup disco...